{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# MLflow Tracing"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Overview\n",
    "In this document, we explain how to set up Guardrails with MLflow Tracing. With this functionality enabled, you can collect additional insights on how your Guard, LLM, and each validator are performing directly in your own Databricks workspace.\n",
    "\n",
    "In this notebook, we'll be using a local MLflow Tracking Server, but you can just as easily switch over to a [hosted Tracking Server](https://mlflow.org/docs/latest/getting-started/tracking-server-overview/index.html#method-3-use-production-hosted-tracking-server).\n",
    "\n",
    "For additional background information on Mlflow Tracing, see the [MLflow documentation](https://mlflow.org/docs/latest/llms/index.html#id1)."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Installing Dependencies\n",
    "\n",
    "Let's start by installing the dependencies we'll use in this exercise.\n",
    "\n",
    "First we'll install Guardrails with the `databricks` extra.  This will include the [mlflow](https://pypi.org/project/mlflow/) library and any other pip packages we'll need."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "! pip install \"guardrails-ai[databricks]\" -q"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Next, we'll ensure the Guardrails CLI is properly configured.  Specifically we want to use remote inferencing for one of the ML backed validators we will be using."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "SUCCESS:guardrails-cli:\n",
      "            Login successful.\n",
      "\n",
      "            Get started by installing our RegexMatch validator:\n",
      "            https://hub.guardrailsai.com/validator/guardrails_ai/regex_match\n",
      "\n",
      "            You can install it by running:\n",
      "            guardrails hub install hub://guardrails/regex_match\n",
      "\n",
      "            Find more validators at https://hub.guardrailsai.com\n",
      "            \n"
     ]
    }
   ],
   "source": [
    "! guardrails configure --enable-metrics --token $GUARDRAILS_TOKEN --enable-remote-inferencing"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Finally, we'll install some validators from the Guardrails Hub."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Installing hub:\u001b[35m/\u001b[0m\u001b[35m/tryolabs/\u001b[0m\u001b[95mrestricttotopic...\u001b[0m\n",
      "✅Successfully installed tryolabs/restricttotopic!\n",
      "\n",
      "\n",
      "Installing hub:\u001b[35m/\u001b[0m\u001b[35m/guardrails/\u001b[0m\u001b[95mvalid_length...\u001b[0m\n",
      "✅Successfully installed guardrails/valid_length!\n",
      "\n",
      "\n"
     ]
    }
   ],
   "source": [
    "! guardrails hub install hub://tryolabs/restricttotopic --no-install-local-models --quiet\n",
    "! guardrails hub install hub://guardrails/valid_length --quiet"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Starting the MLflow Tracking Server\n",
    "\n",
    "Our next step is to start the MLflow Tracking server.  This stands up both the telemetry sink we will send traces to, as well as the web interface we can use to examine them.  You'll need to run this next step is a separate terminal since, otherwise, the server's processes will block execution of the conesecutive cells in this notebook (which is normal)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Run this in the terminal or this cell will block the rest of the notebook\n",
    "# ! mlflow server --host localhost --port 8080"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Creating and Instrumenting our Guard\n",
    "\n",
    "Next up, we'll instrument the Guardrails package to send traces to the MLflow Tracking Server as well as setup our LLM and Guard. \n",
    "\n",
    "As of `guardrails-ai` version 0.5.8, we offer a builtin instrumentor for MLflow."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import mlflow\n",
    "from guardrails.integrations.databricks import MlFlowInstrumentor\n",
    "\n",
    "mlflow.set_tracking_uri(uri=\"http://localhost:8080\")\n",
    "\n",
    "MlFlowInstrumentor(experiment_name=\"My First Experiment\").instrument()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "This instrumentor wraps some of the key functions and flows within Guardrails and automatically captures trace data when the Guard is run.\n",
    "\n",
    "Now that the Guardrails package is instrumented, we can create our Guard."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [],
   "source": [
    "from guardrails import Guard\n",
    "from guardrails.hub import RestrictToTopic, ValidLength\n",
    "\n",
    "guard = Guard(name=\"content-guard\").use_many(\n",
    "    RestrictToTopic(\n",
    "        valid_topics=[\"computer programming\", \"computer science\", \"algorithms\"],\n",
    "        disable_llm=True,\n",
    "        on_fail=\"exception\",\n",
    "    ),\n",
    "    ValidLength(min=1, max=150, on_fail=\"exception\"),\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "In this example, we have created a Guard that uses two Validators: RestrictToTopic and ValidLength. The RestrictToTopic Validator ensures that the text is related to the topics we specify, while the ValidLength Guardrail ensures that the text stays within our character limit."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Testing and Tracking our Guard\n",
    "Next we'll test our our Guard by calling an LLM and letting the Guard validate the output.  After each execution, we'll look at the trace data collected by MLflow Tracking Server."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {},
   "outputs": [],
   "source": [
    "import os\n",
    "\n",
    "# Setup some environment variables for the LLM\n",
    "os.environ[\"DATABRICKS_API_KEY\"] = os.environ.get(\n",
    "    \"DATABRICKS_TOKEN\", \"your-databricks-key\"\n",
    ")\n",
    "os.environ[\"DATABRICKS_API_BASE\"] = os.environ.get(\n",
    "    \"DATABRICKS_HOST\", \"https://abc-123ab12a-1234.cloud.databricks.com\"\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "First, we'll give the LLM an easy prompt that should result in an output that passes validation.  Consider this our happy path test."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\"> ================== Validated LLM output ================== \n",
       "</pre>\n"
      ],
      "text/plain": [
       " ================== Validated LLM output ================== \n"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\"><span style=\"color: #008000; text-decoration-color: #008000\">\"Recursion: A method solving problems by solving smaller instances, calling itself with reduced input until </span>\n",
       "<span style=\"color: #008000; text-decoration-color: #008000\">reaching a base case.\"</span>\n",
       "</pre>\n"
      ],
      "text/plain": [
       "\u001b[32m\"Recursion: A method solving problems by solving smaller instances, calling itself with reduced input until \u001b[0m\n",
       "\u001b[32mreaching a base case.\"\u001b[0m\n"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "from rich import print\n",
    "\n",
    "instructions = {\n",
    "    \"role\": \"system\",\n",
    "    \"content\": \"You are a helpful assistant that gives advice about writing clean code and other programming practices.\",\n",
    "}\n",
    "prompt = \"Write a short summary about recursion in less than 100 characters.\"\n",
    "\n",
    "try:\n",
    "    result = guard(\n",
    "        model=\"databricks/databricks-dbrx-instruct\",\n",
    "        messages=[instructions, {\"role\": \"user\", \"content\": prompt}],\n",
    "    )\n",
    "\n",
    "    print(\" ================== Validated LLM output ================== \")\n",
    "    print(result.validated_output)\n",
    "except Exception as e:\n",
    "    print(\"Oops! That didn't go as planned...\")\n",
    "    print(e)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "If we navigate to http://localhost:8080 in our browser we can see our experiemnt, `My First Experiment`, in the list on the left hand side.  If we select our experiment, and then select the `Traces` tab, we should see one trace from the cell we just ran.\n",
    "![Traces Landing Page](../assets/happy_path_traces_landing_page.png)\n",
    "\n",
    "If we select this trace, we see a breakdown of the various steps taken within the Guard on the left, including a timeline, and a details view for the selected span on the right.  If you click on the different spans within the trace, you can see different attributes specific to that span.  For example, if you click on `guardrails/guard/step/call`, the span that tracked the call to the LLM, you can see all of the parameters that were used to call the LLM, as well as all of the outputs from the LLM including token counts.\n",
    "![LLM Span Details View](../assets/llm_span.png)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Next, let's give the LLM a prompt that instructs it to output something that should fail.  Consider this our exception path test."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 20,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\"> ================== LLM output ================== \n",
       "</pre>\n"
      ],
      "text/plain": [
       " ================== LLM output ================== \n"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">In the realm where the green field doth lie,\n",
       "Where the sun shines bright and the sky's azure high,\n",
       "A game of skill, of strategy and might,\n",
       "Unfolds in innings, under the sun's warm light.\n",
       "\n",
       "Batter up, the crowd cheers with delight,\n",
       "As the pitcher winds up, with all his might,\n",
       "The ball whizzes fast, a blur of white,\n",
       "A dance of power, in the afternoon light.\n",
       "\n",
       "The bat meets ball, a crack, a sight,\n",
       "A thrill runs through, like an electric spike,\n",
       "The fielders scatter, in a frantic hike,\n",
       "To catch or miss, it's all in the strike.\n",
       "\n",
       "The bases loaded, the tension's tight,\n",
       "A single run could end the night,\n",
       "The crowd holds breath, in anticipation's height,\n",
       "For the game's outcome, in this baseball fight.\n",
       "\n",
       "The outfielder leaps, with all his height,\n",
       "A catch or miss, could decide the plight,\n",
       "The ball falls short, in the glove's tight knit,\n",
       "A collective sigh, as the inning's writ.\n",
       "\n",
       "The game goes on, through day and night,\n",
       "A battle of wills, in the stadium's light,\n",
       "A symphony of plays, in the diamond's sight,\n",
       "A poem of baseball, in black and white.\n",
       "</pre>\n"
      ],
      "text/plain": [
       "In the realm where the green field doth lie,\n",
       "Where the sun shines bright and the sky's azure high,\n",
       "A game of skill, of strategy and might,\n",
       "Unfolds in innings, under the sun's warm light.\n",
       "\n",
       "Batter up, the crowd cheers with delight,\n",
       "As the pitcher winds up, with all his might,\n",
       "The ball whizzes fast, a blur of white,\n",
       "A dance of power, in the afternoon light.\n",
       "\n",
       "The bat meets ball, a crack, a sight,\n",
       "A thrill runs through, like an electric spike,\n",
       "The fielders scatter, in a frantic hike,\n",
       "To catch or miss, it's all in the strike.\n",
       "\n",
       "The bases loaded, the tension's tight,\n",
       "A single run could end the night,\n",
       "The crowd holds breath, in anticipation's height,\n",
       "For the game's outcome, in this baseball fight.\n",
       "\n",
       "The outfielder leaps, with all his height,\n",
       "A catch or miss, could decide the plight,\n",
       "The ball falls short, in the glove's tight knit,\n",
       "A collective sigh, as the inning's writ.\n",
       "\n",
       "The game goes on, through day and night,\n",
       "A battle of wills, in the stadium's light,\n",
       "A symphony of plays, in the diamond's sight,\n",
       "A poem of baseball, in black and white.\n"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
       "\n",
       " ================== Validation Errors ================== \n",
       "</pre>\n"
      ],
      "text/plain": [
       "\n",
       "\n",
       " ================== Validation Errors ================== \n"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
       "RestrictToTopic: No valid topic was found.\n",
       "</pre>\n"
      ],
      "text/plain": [
       "\n",
       "RestrictToTopic: No valid topic was found.\n"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "prompt = \"Write a really long poem about baseball.\"\n",
    "\n",
    "try:\n",
    "    result = guard(\n",
    "        model=\"databricks/databricks-dbrx-instruct\",\n",
    "        messages=[instructions, {\"role\": \"user\", \"content\": prompt}],\n",
    "    )\n",
    "\n",
    "    print(\"This success was unexpected. Let's look at the output to see why it passed.\")\n",
    "    print(result.validated_output)\n",
    "except Exception:\n",
    "    # Great! It failed just like we expected it to!\n",
    "    # First, let's look at what the LLM generated.\n",
    "    print(\" ================== LLM output ================== \")\n",
    "    print(guard.history.last.raw_outputs.last)\n",
    "\n",
    "    # Next, let's examine the validation errors\n",
    "    print(\"\\n\\n ================== Validation Errors ================== \")\n",
    "    for failed_validation in guard.history.last.failed_validations:\n",
    "        print(\n",
    "            f\"\\n{failed_validation.validator_name}: {failed_validation.validation_result.error_message}\"\n",
    "        )"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "First note that there is only one failed validator in the logs: `RestrictToTopic`.  This is because since we set `on_fail=\"exception\"`, the first failure to occur will raise an exception and interrupt the process.  If we set our OnFail action to a different value, like `noop`, we would also see a log for `ValidLength` since the LLM's output is clearly longer than the max length we specified.\n",
    "\n",
    "If navigate back to the MLflow UI in our browser, we see another trace.  Since this last cell raised an exception, we see that the status is listed as `Error`.\n",
    "![Traces Landing Page With Exception](../assets/exception_path_landing_page.png)\n",
    "\n",
    "If we open this new trace we see, just like in the history logs, only `RestrictToTopic` has a recorded span.  This is, again, because it raised an exception on failure exitting the validation loop early.\n",
    "\n",
    "If we click on the validator's span, and scroll down to the bottom of its details panel, we can see the reason why validation failed: `\"No valid topic was found.\"`\n",
    "![Exception Path Trace](../assets/exception_path_trace.png)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Conclusion\n",
    "\n",
    "With Guardrails, MLflow, and the Guardrails MLflowInstrumentor, we can easily monitor both our LLMs and the validations we're guarding them with.  To learn more, check out [Guardrails AI](https://www.guardrailsai.com/) and [MLflow](https://mlflow.org/docs/latest/index.html)."
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": ".venv",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.12.4"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
